ABSTRACT

We use the Hubble Ultra Deep Field to study the galaxy luminosity-size ($M$-$R_e$)
distribution. With a careful analysis of selection effects due to both detection completeness
and measurement reliability we identify bias-free regions in the $M$-$R_e$ plane for a series
of volume-limited samples. By comparison to a nearby survey also having well defined
selection limits, namely the Millennium Galaxy Catalogue, we present clear evidence
for evolution in surface brightness since z ~ 0.7. Specifically, we demonstrate that
the mean, rest-frame B-band $\langle\mu\rangle_e$ for galaxies in a sample spanning 8 magnitudes in
luminosity between $M_B = -22$ and $-14$ mag increases by ~1.0 mag arcsec$^{-2}$ from
z ~ 0.1 to z ~ 0.7. We also highlight the importance of considering surface brightness
dependent measurement biases in addition to incompleteness biases. In particular, the 
increasing, systematic under-estimation of Kron fluxes towards low surface bright- 
nesses may cause diffuse, yet luminous, systems to be mistaken for faint, compact 
objects. 
1 INTRODUCTION

The observational properties of galaxies at any given epoch
are a direct result of their formation and evolution histories.
Theories of galaxy formation traditionally adhere to one of
two vastly different paradigms: either monolithic collapse
(MC) or hierarchical clustering (HC). In the MC scenario
all galaxy types and sizes are created at high redshift by
the rapid collapse of primordial gas clouds (Eggen et al.
1962; Larson 1975). Subsequent evolution is then primarily
passive with minimal interaction between nearby
neighbours. Alternatively, under the HC scheme larger
galaxies are progressively built from smaller ones during
the hierarchical merging of their host dark matter (DM)
haloes (White & Rees 1978). Within the HC framework,
disc galaxies are the first morphological types formed in
the early universe, while ellipticals are constructed from
mergers of similar-sized discs over roughly a Hubble time
(Toomre & Toomre 1972; Toomre 1977).

Numerous observational studies have provided evi- 
dence to support or contradict each of these paradigms, 
and recent revisions to both theories have been made in 
light of such findings. For instance, the homogeneity in the
early-type colour-magnitude relation from different clusters
(Andreon 2002) suggests that these galaxies formed at
z > 2, which favours MC over HC. However, a number
of imaging studies (such as Barger et al. 1999) have reported
a deficit of distant ellipticals with passively-evolving
colours, contradicting the historic MC scenario of a single
burst of star-formation in the early universe. A so-called
'reformed monolithic collapse' model (Schade et al. 1999)
has the majority of stars forming at high redshift, but
with secondary episodes of star-formation at low redshift
caused by internal processes. Bell et al. (2004) advocate an
adaptation of HC by incorporating 'dry mergers', in which 
the brightest ellipticals grow in size by gas-poor mergers 
with other ellipticals. This picture is motivated by their 
discovery of a factor of 2 increase in stellar mass on the 
red (i.e., passively evolving) sequence since z ~ 1 in the 
COMBO-17 survey data. This result is consistent with stud- 
ies of partially-depleted cores in elliptical galaxies, which 
indicate such galaxies have experienced, on average, one dry
merger (Graham 2004; Merritt 2006). Some authors (such
as Lacey & Fall; Cayon et al. 1996; Bouwens et al. 1997)
have developed theories outside of either paradigm
using the so-called 'backward' approach, whereby high 
redshift galaxy properties are inferred from detailed studies 
of the star-formation history of the Milky Way and other 
local galaxies. They offer the alternative 'infall' model
whereby a radially-dependent global star-formation rate
means galaxies form from the inside out. Others, such
as Driver et al. (2006), advocate in a purely qualitative
manner a mixed model in which bulges form first via a
rapid collapse or merger phase, with subsequent disc
formation through splashback and infall.

Recent multi-wavelength, deep imaging surveys (e.g. 
the HDFs, GOODS, GEMS, COSMOS and the UDF) have 
provided a wealth of data for empirical studies of high 
redshift galaxy evolution, which can test and constrain 
the above-mentioned formation theories. For example,
Somerville et al. (2004) compare the photometric redshift
distribution and morphologies of galaxies in the GOODS 
southern field to theoretical expectations. In the important 
z > 1.5 regime where the models strongly diverge, they 
observe an excess number density relative to the MC 
prediction and a deficit relative to the HC one. However, 
the disturbed morphologies of objects in their high redshift 
sample are interpreted as evidence in favour of the general
framework of hierarchical formation. Daddi et al. (2005)
present a selection of galaxies at 1.4 < z < 2.5 in the UDF
with compact, early-type morphologies and spectral energy
distributions (SEDs) consistent with passively evolving
stellar populations. They demonstrate that the space
density of these galaxies at $\langle z \rangle = 1.7$ is only a factor of
2-3 smaller than that of their local counterparts. At first
glance the prevalence of such galaxies in the early universe
is difficult to explain within the HC theory, in which
luminous ellipticals should be the last galaxy types to form.
However, De Lucia et al. (2006) argue that 'down-sizing'
behaviour, for elliptical galaxy star formation, is actually an
inherent property of hierarchical formation in their ΛCDM
cosmological simulations. 

A number of authors have attempted to quantify 
luminosity and size (or surface brightness) evolution in the 
galaxy population to high redshift using deep imaging
surveys. For bright galaxies ($M_B < -18$ mag), the distributions
of these two key observables are well constrained locally,
both individually, as the luminosity function (Norberg et al.
2002; Blanton et al. 2003) and size function (Shen et al.
2003), and in bivariate space, as the luminosity-size
distribution (or LSD; Cross et al. 2001; Shen et al. 2003;
Driver et al. 2005). It is notoriously difficult to make
robust comparisons between galaxy samples drawn from 
different epochs due to the impact of redshift and surface 
brightness dependent selection effects, such as the $(1+z)^4$
cosmological surface brightness dimming. Fortunately
though, the luminosity-size plane is the natural domain in
which to confront such observational biases (Disney 1976;
Phillipps & Disney 1986; Boyce & Phillipps 1995), and
deep, space-based imaging can push back the low surface
brightness, faint magnitude and compact size boundaries
(see Driver 1999). McIntosh et al. (2005) study early-type,
red galaxies in the GEMS survey and find evidence for 
evolution in the LSD consistent with the passive fading of 
ancient stellar populations. In particular, they report a ~1.0 
mag increase (U-band) at small sizes ($0.5 < R_{50} < 1.0\,h^{-1}$
kpc) to z = 0.7 and a ~0.7 mag increase for larger sizes to
z = 1.0. Considering disc-dominated galaxies in the GEMS
survey, Barden et al. (2005) find strong evolution of ~1.0
mag arcsec$^{-2}$ in U-band surface brightness to z ~ 1, but
a constant stellar-mass-size relation over this time. Their
results best fit the predictions of the infall model of galaxy
formation. Trujillo et al. (2005) have recently combined
deep, near-IR imaging of the HDF-S and MS1054-03 fields 
with the SDSS (z ~ 0.1) and GEMS (z ~ 0.2 - 1) surveys. 
They present evidence of size evolution at fixed luminosity
of $(1+z)^{-0.84\pm0.05}$ for early-types and $(1+z)^{-1.01\pm0.08}$ for
late-types (in rest-frame U-band) out to z ~ 3, and reach
similar conclusions to the other two studies mentioned 
above. 

In this paper we use the unprecedented depth of
the UDF ACS images (Beckwith et al. 2006) with the
supporting GOODS project NICMOS (Thompson et al.
2005) and ISAAC (Vandame et al., in prep.) observations
to study the galaxy LSD out to high redshift. Careful
attention is paid to the relevant selection effects to define
a bias free region of parameter space in the absolute 
magnitude-size plane at a range of redshift intervals. A 
series of volume-limited samples of UDF galaxies is then 
defined and the corresponding LSDs constructed. These 
are compared to a local benchmark from the Millennium 
Galaxy Catalogue (MGC) and the degree of evolution 
quantified. Finally, we compare our method and results to 
those of other recent studies in this field. The outline of 
this paper is as follows. Section 2 contains a description of 
the dataset, as well as the measurement of photometric and 
structural parameters. The selection limits are defined in 
Section 3, and in Section 4 we present evidence of galaxy 
evolution. Section 5 presents the comparison of our results 
to others and in Section 6 we summarise our work and give 
conclusions. A cosmological model with $\Omega_0 = 0.3$, $\Omega_\Lambda = 0.7$
and $H_0 = 100$ km s$^{-1}$ Mpc$^{-1}$ is used throughout. These
specific values of the cosmological parameters were adopted 
for ease of comparison between the present UDF work and 
the slightly older MGC results. Unless otherwise stated, all 
magnitudes are given in the AB system. 



2 THE UDF DATA 

The Hubble Ultra Deep Field (UDF) consists of an 11
arcmin$^2$ patch of sky centred on RA = 03h 32m 39.0s, Dec
= -27° 47' 29.1" (J2000) in the region of the Fornax
constellation. The publicly released ACS/WFC Combined
Images (version 1.0) span the optical wavelength range
3700 to 10,000 Å in four wide-band filters: F435W (B),
F606W (V), F775W (i) and F850LP (z). The F775W
i-band image has the longest total exposure time of 347,110
s (144 orbits). Each single exposure was half an orbit in 
duration with the pointing cycled through a four part 
dither pattern. Each image has been processed through the 
standard HST data pipeline and drizzled to a pixel scale of 
0.03"/pixel. An i-band selected catalogue of 10,040 sources
(h_udf_wfc_V1_cat, hereafter referred to as 'the on-line
catalogue') is also included in the version 1.0 data release;
details of its construction are presented in Beckwith et al.
(2006). However, as no Kron magnitudes were provided we
elected to generate our own object catalogue as described
below. This also allowed us to make our own decisions
regarding deblending of irregular sources. 



2.1 Source detection 

A preliminary source extraction was performed using the 
Starlink implementation (Extractor V1.4-3) of the popular
SExtractor package (Bertin & Arnouts 1996). The threshold
for both detection and analysis of our objects was set to
a constant surface brightness of 27.395 mag arcsec$^{-2}$ and
a uniform background adopted. The minimum number of 
connected pixels to register a detection was set to 9, which is 
consistent with the size of the PSF FWHM (~0.084"). Two 
parameters critical to object deblending are the minimum 
contrast value and the number of deblending sub-thresholds
(see Bertin & Arnouts 1996). Our choices were identical
to those used in generating the on-line catalogue, namely 
0.03 and 32 respectively. An abnormally high number 
of spurious detections were found along the edge of the 
image mosaic where there are fewer stacked exposures and 
the signal-to-noise is poor. Objects with centroids inside 
these regions, which extend ~100 pixels (3") in from the 
field boundary, were removed from the source catalogue. 
The constant background approach was chosen to avoid 
additional biases against faint, extended galaxies that can 
arise in a mesh-based subtraction. Variation of the mean, 
local background level over the science-grade i-band image 
was found to be roughly two orders of magnitude smaller 
than the width of the background noise distribution. As 
such it will have a negligible effect on the recovered
magnitudes. The 27.395 mag arcsec$^{-2}$ level for our detection
and analysis threshold was chosen to be similar to that 
used in extracting the on-line catalogue, which was set to 
0.61 times the RMS background noise. Excluding the low 
signal-to-noise boundary region decreases the measured 
width of the background noise. Thus, although 27.395 mag
arcsec$^{-2}$ corresponds to 0.61 times the full field RMS, it
gives an effective limit of 0.91 times the RMS of the interior 
region actually used for the study. 
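
The link between a surface brightness threshold and a per-pixel threshold in image units can be sketched as follows. This is only an illustration of the conversion, not the extraction configuration itself: the zero point `ZERO_POINT` is an assumed AB value for the F775W image that does not appear in the text, whereas the 0.03"/pixel scale, the 27.395 mag arcsec$^{-2}$ threshold and the factors 0.61 and 0.91 are taken from the description above.

```python
import math

def sb_to_counts_per_pixel(mu, zero_point, pixel_scale):
    """Surface brightness (mag arcsec^-2) -> counts per pixel.

    Inverts mu = zero_point - 2.5*log10(counts / pixel_area).
    """
    pixel_area = pixel_scale ** 2  # arcsec^2 per pixel
    return pixel_area * 10.0 ** (-0.4 * (mu - zero_point))

def counts_per_pixel_to_sb(counts, zero_point, pixel_scale):
    """Counts per pixel -> surface brightness (mag arcsec^-2)."""
    return zero_point - 2.5 * math.log10(counts / pixel_scale ** 2)

ZERO_POINT = 25.654   # ASSUMPTION: AB zero point of the i-band image
PIXEL_SCALE = 0.03    # arcsec per pixel (from the text)
MU_THRESHOLD = 27.395 # mag arcsec^-2 (from the text)

threshold = sb_to_counts_per_pixel(MU_THRESHOLD, ZERO_POINT, PIXEL_SCALE)

# The same threshold is 0.61 x the full-field RMS and 0.91 x the interior RMS,
# so the implied background RMS values are:
rms_full_field = threshold / 0.61
rms_interior = threshold / 0.91
```

Under these numbers the interior RMS is about two thirds of the full-field value (0.61/0.91), which is the narrowing of the noise distribution described above.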



2.2 Comparison with on-line catalogue 

A sub-sample of 2532 sources with an i-band Kron mag- 
nitude brighter than 28.0 mag was selected from our 
preliminary detection list (1.5 mag brighter than the 
nominal completeness limit derived from the turnover of 
the counts). The positions of these objects were compared 
to those in the on-line catalogue. A total of 125 had 
centroids in disagreement by more than 5 pixels (0.15"). 
These objects, as well as the 50 largest galaxies, were 
visually inspected. A pseudo-colour image was generated 
by combining the V-, i- and z-band WFC/ACS observations.
In each case, our segmentation image was viewed
alongside the on-line one, as well as its colour and i-band
counterparts. SExtractor appeared to have erroneously
split a single galaxy into multiple sections for 24 of
the 50 largest galaxies in our catalogue. These were restored 
and 65 redundant sub-components deleted. Sixty of the 125 
objects with mis-matched centroids were also thought to 



have been poorly deblended. The most common problem 
(45 instances) was under-deblending where an apparently 
close pair of galaxies displayed markedly different colours, 
suggesting a line-of-sight overlap at different redshifts 
rather than a single object or merger. The reverse was 
true for 9 galaxies with dual nuclei over-deblended. There 
were also 6 false detections caused by the diffraction 
spikes of bright stars. SExtractor was rerun with four 
alternative minimum deblend contrast parameter settings 
(0.001, 0.01, 0.1 and 0.5) to fix these problems. The final 
sample of 2497 objects brighter than 28th magnitude will 
be referred to as iUDF-BRIGHT. Although the expected
10σ limiting magnitude for point sources in the UDF
i-band image is 29.2 mag (Beckwith et al. 2006), we adopt
the more conservative 28th magnitude cut-off to ensure 
a reasonable completeness and reliability in the detection 
of extended sources (see Section 3.1). An initial round 
of star-galaxy separation was performed at this stage 
using SExtractor's 'stellaricity' index, a value between
0 (galaxy) and 1 (star) assigned by an artificial neural
network routine for classifying objects. There is a clear 
bimodality in the distribution of output stellaricity vs. 
magnitude for iUDF-BRIGHT objects down to 27th mag 
and we identify 21 certain stars with indices greater than 
0.95. We also confirm another 5 over-exposed stars through 
a visual inspection of objects brighter than 22nd mag 
with indices greater than 0.8. Beyond 27th mag, where 
there are numerous cases of intermediate stellaricity, we 
rely upon our photometric redshifts to establish object type. 
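
The object counts quoted in this section can be cross-checked with a short piece of bookkeeping. The sketch below assumes (this is not stated explicitly in the text) that each under-deblended pair contributes one additional object once split, that each over-deblended dual-nucleus galaxy removes one object once merged, and that the 65 redundant sub-components and 6 diffraction-spike detections are simply dropped.

```python
initial_sources = 2532       # preliminary list brighter than 28.0 mag
redundant_fragments = 65     # sub-components of the over-split large galaxies
under_deblended = 45         # ASSUMPTION: each split adds one object
over_deblended = 9           # ASSUMPTION: each merge removes one object
false_detections = 6         # diffraction spikes of bright stars

final_sources = (initial_sources
                 - redundant_fragments
                 + under_deblended
                 - over_deblended
                 - false_detections)

certain_stars = 21           # stellaricity > 0.95
saturated_stars = 5          # confirmed by visual inspection
stars_flagged = certain_stars + saturated_stars

print(final_sources)  # 2497
```

With these assumptions the total reproduces the 2497 objects of iUDF-BRIGHT.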



2.3 Photometric comparison 

Here we compare our photometry with that of the on-line 
catalogue. Fig. 1 contains a plot of the difference between
isophotal magnitudes computed for the iUDF-BRIGHT 
galaxies and the on-line values as a function of our preferred 
Kron values. There is a slight difference between the limit- 
ing isophote used to compute object flux in each catalogue, 
and we use a constant background whereas Beckwith et al.
(2006) use RMS weight maps. However, with the exception 
of a number of outliers that were deblended differently, 
both measurements are generally very similar. The faintest 
objects show the largest discrepancy with the 27-28 mag 
bin having a 3σ-clipped mean difference of -0.06 mag and a
standard deviation of 0.07 mag. This level of disagreement 
is not considered significant or problematic given our 
alternative extraction procedure. 
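
A clipped mean and scatter per magnitude bin, of the kind quoted above, could be computed along the following lines. This is a minimal sketch: the function names, the iteration limit and the choice of clipping about the mean (rather than the median) are illustrative assumptions, not a description of the procedure actually used.

```python
import numpy as np

def sigma_clipped_stats(values, n_sigma=3.0, max_iter=10):
    """Iteratively reject points beyond n_sigma standard deviations of the mean.

    Returns (mean, standard deviation, number of surviving points).
    """
    v = np.asarray(values, dtype=float)
    for _ in range(max_iter):
        mean, std = v.mean(), v.std()
        keep = np.abs(v - mean) <= n_sigma * std
        if keep.all():
            break
        v = v[keep]
    return float(v.mean()), float(v.std()), int(v.size)

def binned_offsets(kron_mag, delta_mag, edges):
    """Clipped statistics of delta_mag in bins of Kron magnitude."""
    kron_mag = np.asarray(kron_mag, dtype=float)
    delta_mag = np.asarray(delta_mag, dtype=float)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (kron_mag >= lo) & (kron_mag < hi)
        if in_bin.sum() >= 2:  # need at least two points for a scatter
            rows.append((lo, hi) + sigma_clipped_stats(delta_mag[in_bin]))
    return rows
```

Here `delta_mag` would hold the isophotal magnitude differences between the two catalogues and `edges` the bin boundaries in Kron magnitude, for example `[27.0, 28.0]` for the faintest bin discussed above.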



2.4 Half light radii measurements 

The half light radii described in this paper will be defined 
as the semi-major axis of the elliptical aperture containing 
half an object's Kron magnitude flux. This parameter was 
calculated using a Fortran program that iteratively refines 
the size of a test elliptical aperture and sums the enclosed 
light. The position angle and ellipticity of the test aperture 
used are those derived by SExtractor during the object de- 
tection and analysis phase. Flux contamination from nearby 
objects was avoided by excluding pixels attributed to other 
Figure 1. Differences between the measured i-band isophotal
magnitudes of objects in the iUDF-BRIGHT catalogue and those
of the on-line one. This data is plotted as a function of the
iUDF-BRIGHT Kron magnitudes used to select our m < 28.0 mag
sample. The mean difference in each magnitude bin is overplotted
as a grey square with 1σ error bars after 3σ-clipping.



sources in the segmentation image. Pixels on the boundary 
of the test aperture were split into a 10 x 10 grid in order to 
estimate their fractional contribution to the enclosed light. 
The half light radii of compact objects are affected by the 
blurring effect of the diffraction limited PSF. Using the ar- 
tificial galaxy simulations described in Section 3.2 we find 
we can recover the true (i.e., intrinsic) size to within 25 per 
cent accuracy down to 0.05" by correcting the measured size 
according to 



$$R_{e,\mathrm{intrinsic}} = \sqrt{R_{e,\mathrm{measured}}^2 - a\,\Gamma^2},$$

where $a = 0.30$ and $\Gamma$ = 0.084" is the PSF FWHM. The
value of a was chosen to optimise the accuracy of the re- 
covered sizes. The half light radii derived in this manner 
were used to calculate the apparent mean effective surface 
brightness of our objects via the relation 

$$\langle\mu\rangle_{e,\mathrm{app}} = m_{\mathrm{Kron}} + 2.5\log_{10}(2\pi R_e^2).$$

This provides a crude inclination correction assuming zero 
opacity. 
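
As a rough numerical illustration of the two relations in this section, the sketch below applies a PSF size correction and then computes the apparent mean effective surface brightness. The functional form of the size correction is an assumption (a quadrature subtraction of a term $a\Gamma^2$), and clamping unresolved objects to zero is a choice made here purely for illustration; only the values a = 0.30 and Γ = 0.084" come from the text. The surface brightness uses the circular area $2\pi R_e^2$ with the semi-major axis as the radius.

```python
import math

PSF_FWHM = 0.084  # arcsec (from the text)
A_COEFF = 0.30    # from the text

def intrinsic_radius(r_measured, a=A_COEFF, fwhm=PSF_FWHM):
    """ASSUMED form: R_int = sqrt(R_meas^2 - a * FWHM^2), radii in arcsec.

    Returns 0.0 when the measured size is smaller than the correction term.
    """
    arg = r_measured ** 2 - a * fwhm ** 2
    return math.sqrt(arg) if arg > 0.0 else 0.0

def mean_effective_sb(m_kron, r_e):
    """Apparent mean surface brightness within R_e (mag arcsec^-2).

    Half the flux spread over an area pi*R_e^2 gives m + 2.5*log10(2*pi*R_e^2).
    """
    return m_kron + 2.5 * math.log10(2.0 * math.pi * r_e ** 2)

# Example: a 25th magnitude galaxy with a measured half light radius of 0.3"
r_e = intrinsic_radius(0.3)
mu_app = mean_effective_sb(25.0, r_e)
```

Because the semi-major axis is used as the radius of a circular area, an inclined disc is assigned the surface brightness it would have if seen face-on and fully transparent.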



2.5 Redshifts 

Photometric redshifts for objects in the iUDF-BRIGHT 
sample were obtained from two separate sources — from a 
catalogue supplied by B. Mobasher (priv. comm., 2005) and
from the catalogue of Coe et al. (2006), hereafter referred
to as M05 and C06 respectively. By deriving alternative
luminosity-size relations using each catalogue in turn and 
comparing them, we hope to gauge the impact of the 
potentially large inaccuracies of the photometric redshift 



approach on our results. Both sets of redshift estimates were
computed using the Bayesian method of Benitez (2000), but
with significant differences in the implementation. Firstly, 
M05 uses a fixed 1" aperture to compute object fluxes in 
each bandpass filter, while C06 use a more sophisticated 
procedure to derive aperture-matched, PSF-corrected 
fluxes. And secondly, although both use the recalibrated
SED template library of Benitez et al. (2004), C06 add
two new blue model starburst templates. These differences 
lead to large disagreements in the redshifts derived for 
many of the iUDF-BRIGHT galaxies. The M05 catalogue 
consists of 7250 z-band selected objects of which 2385 have 
counterparts in iUDF-BRIGHT (2497 in total), and 73 are 
identified as having star-like SEDs (leaving 2312 galaxies). 
All these galaxies have counterparts in the C06 catalogue. 

The iUDF-BRIGHT galaxy redshift distributions from 
each catalogue are shown in Fig. 2 for comparison, and
they differ substantially. M05 finds strong peaks in the bins 
spanning z = 0-0.125, 0.625-0.75 and 1.875-2.125, whereas 
C06 find peaks at z = 0.5-0.75, 0.875-1.125 and 1.25-1.375. 
It is encouraging, at least, that both detect an overdensity 
corresponding to the wall identified in the wider CDFS at
z ~ 0.67 by Le Fevre et al. (2004) in their spectroscopic
survey. However, Coe et al. (2006) note that they do not
find the z ~ 0.73 wall identified in the same survey (while 
M05 appears to), although the ability of the photometric 
technique to resolve such close features is questionable. 
The disagreement between the two catalogues concerning 
the redshifts of other regions of overdensity is worrying. 
It appears to result mainly from the differences in SED 
template libraries used since the majority of galaxies in 
these features are matched in C06 by their bluest starburst 
templates. The galaxies in these disputed features are 
also extremely faint and lack spectroscopic redshifts, so it 
is impossible to evaluate the merits of each catalogue in 
this regard. Hence, we duplicate all analyses using both 
catalogues in parallel and later investigate the effect of their 
disagreement on our final results. We can, however, investi- 
gate the accuracy of our photometric redshifts for a small 
number of bright galaxies with published spectroscopic data. 

An on-line master catalogue$^1$ of published spectroscopic
redshifts for objects in the GOODS CDF-S field (encompassing
the UDF) is maintained by Rettura. There are 18
of these from VLT FORS2 observations (Vanzella et al.
2005) with 'solid' or 'likely' quality flags that match
iUDF-BRIGHT objects. The VIMOS VLT Deep Survey
(Le Fevre et al. 2004) provides redshifts with 95 per cent
or 100 per cent confidence flags for another 24 members of
our sample. Fig. 3 contains a plot comparing these spectroscopic
redshifts to the photometric estimates from the M05
catalogue. Upon the exclusion of six outliers (from 42),
the remaining measurements are in close agreement with a
mean difference, $\Delta z = z_{phot} - z_{spec}$, of $0.012(1 + z_{spec})$ and
a standard deviation of just $0.101(1 + z_{spec})$. In a similar
comparison to 41 galaxies in the Rettura catalogue for
which they have 'reliable' photometric redshift estimates
(as identified by their ODDS and $\chi^2_{mod}$ values) C06 find



http://www.eso.org/science/goods/spectroscopy/CDFS_Mastercat/
a much smaller standard deviation of only $0.04(1 + z_{spec})$.
Whilst this comparison certainly provides an indication of 
the relative accuracy and reliability of the two photometric 
redshift catalogues for bright galaxies, it cannot be extrap- 
olated to evaluate their performance at faint magnitudes. 
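
The comparison statistics quoted above are expressed in units of $(1 + z_{spec})$. A minimal sketch of how such numbers might be obtained from matched redshift pairs is given below; the function name and the optional `outlier_cut` are hypothetical, since in the text the outliers are identified individually rather than by a fixed threshold.

```python
import numpy as np

def photoz_residual_stats(z_phot, z_spec, outlier_cut=None):
    """Mean and scatter of (z_phot - z_spec) / (1 + z_spec).

    outlier_cut: if given, pairs with |normalised residual| >= outlier_cut
    are excluded (a hypothetical criterion used here for illustration).
    Returns (mean, standard deviation, number of excluded pairs).
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    if outlier_cut is None:
        keep = np.ones(dz.shape, dtype=bool)
    else:
        keep = np.abs(dz) < outlier_cut
    return float(dz[keep].mean()), float(dz[keep].std()), int((~keep).sum())
```

Applied to the 42 matched objects with the six outliers removed, a routine of this kind would return the mean offset and standard deviation in the same normalised units as the values quoted for M05 and C06.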

The six outliers in the M05 comparison serve as case 
studies of situations in which the photometric redshift 
technique can break down entirely. In most instances 
the problem ultimately stems from incorrect aperture 
magnitudes. For example, two spectroscopically confirmed 
stars ($z_{spec}$ = 0.000) were mis-classified as galaxies at
$z_{phot}$ = 0.090 and 0.510 respectively because they saturated
in one or more of the ACS filters, thereby corrupting their 
flux measurements. (We note that these objects would 
not have been included in our galaxy sample anyway as 
they were previously identified as stars based on their high 
SExtractor stellaricity indices.) Three other outliers had 
their aperture magnitudes spoilt by contamination from 
bright, line-of-sight companions. The remaining outlier
($z_{spec}$ = 0.3151, $z_{phot}$ = 1.170) was for a highly disturbed
system with multiple nuclei. The most likely problem here
was that the irregular, young stellar population of this ob- 
ject was poorly represented by any of the standard spectral 
templates used for the photometric redshift calculations. 
The frequency of galaxies with unusually blue, star-forming 
SEDs in the full iUDF-BRIGHT sample is estimated 
to be ~6.0 per cent (see Section 2.6 below), which is a
relatively minor source of uncertainty in our final results. 
However, it does suggest that the use of sophisticated 
aperture-matched, PSF-corrected fluxes and inclusion of 
extra blue starburst templates in the C06 method is likely 
to be beneficial. 



2.6 K-corrections 

Individual galaxy K-corrections, from observed i-band to
rest-frame MGC filter B-band (Liske et al. 2003), were
computed as follows using photometric redshifts and broad-band
magnitudes from both the M05 and C06 catalogues.
ACS/WFC photometry was available for all galaxies in at
least two of the four B, V, i and z-bands, plus 49 per cent
(1132) and 16 per cent (377) of galaxies had additional
NICMOS J and/or H-band and ISAAC $K_s$-band photometry
respectively. Total system throughput curves were obtained
for the relevant filter plus instrument combinations. These
were then integrated over the redshifted synthetic spectral
templates of Poggianti (1997) to generate a series of artificial
magnitudes in each band. The library of Poggianti (1997)
contains 27 model spectra based on three Hubble types (E, 
Sa and Sc) with stellar population ages in the range 2.2 to 15 
Gyr. An additional flat spectrum was added as a 'catch-all' 
type option for very blue galaxies not adequately represented 
by the original SED library. The best fit template for each 
galaxy was identified via a minimisation of $\chi^2$
Figure 2. The distribution of photometric redshifts for galaxies
in iUDF-BRIGHT from the M05 (black outline) and C06 (grey 
shaded) catalogues. The inset figure is a comparison of individual 
galaxy redshift estimates in the range relevant to this study, z = 
0.25 to 1.15. 
Figure 3. A comparison of the M05 photometric redshift
estimates with spectroscopically measured values from Rettura's
CDF-S Master catalogue (24 matched redshifts from the VIMOS
VLT Deep Survey and 18 from VLT FORS2). With the exclusion
of the six outliers (open triangles), redshifts for the remaining 36
galaxies (solid triangles) are all in close agreement with a mean
difference, $\Delta z = z_{phot} - z_{spec}$, of $0.012(1 + z_{spec})$ and a standard
deviation of just $0.101(1 + z_{spec})$. Reasons for the failure of the
photometric technique on the six outliers are discussed at the end
of Section 2.5.



using the quoted errors on the magnitudes in each catalogue. 
A total of 221 galaxies (9.6 per cent) returned minimum $\chi^2$
values corresponding to probabilities less than 5 per cent.
These extreme outliers were frequently (~6.0 per cent of our 
sample) best-fit by the default flat spectral template, sug- 
gesting some level of incompleteness in our synthetic SED li- 
brary for galaxies caught during their starburst phase. Once 
the best fit redshifted template, $f_z(\lambda)$, was identified, the
K-correction to MGC B-band was computed according to 



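$$K_{i,B_{\mathrm{MGC}}} = 2.5\log(1 + z) + 2.5\log\left[\frac{\int_0^\infty f_z(\lambda)\,\phi_{B,\mathrm{MGC}}(\lambda)\,d\lambda}{\int_0^\infty f_z\!\left(\frac{\lambda}{1+z}\right)\phi_i(\lambda)\,d\lambda}\right],$$

where $\phi_i(\lambda)$ represents the i-band filter transmission
function and $\phi_{B,\mathrm{MGC}}(\lambda)$ that of the MGC B-band filter.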
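
A K-correction of this general form can be evaluated numerically as sketched below. The sketch assumes the template is supplied as a function of rest-frame wavelength and that the denominator samples it at λ/(1+z); the top-hat filters and flat spectrum in the usage note are toy inputs chosen only to make the behaviour easy to check, and are not the throughput curves or templates used in this work.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral of y(x) on a sampled grid."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def k_correction(z, sed, phi_obs, phi_rest, wavelengths):
    """K-correction from an observed band to a rest-frame band.

    sed      : callable, template flux density vs rest wavelength (ASSUMED)
    phi_obs  : callable, transmission of the observed-frame filter
    phi_rest : callable, transmission of the rest-frame target filter
    """
    lam = np.asarray(wavelengths, dtype=float)
    numerator = trapezoid(sed(lam) * phi_rest(lam), lam)
    denominator = trapezoid(sed(lam / (1.0 + z)) * phi_obs(lam), lam)
    return 2.5 * np.log10(1.0 + z) + 2.5 * np.log10(numerator / denominator)

def top_hat(lo, hi):
    """Toy filter: unit transmission between lo and hi (same units as the grid)."""
    return lambda lam: ((lam >= lo) & (lam <= hi)).astype(float)

def flat_sed(lam):
    """Toy template: constant flux density at all wavelengths."""
    return np.ones_like(lam)
```

For instance, with `grid = np.arange(2000.0, 12001.0, 1.0)`, a flat template and two top-hat filters of equal width, `k_correction(0.7, flat_sed, top_hat(7000, 8000), top_hat(3900, 4900), grid)` reduces to the bandwidth term 2.5 log(1 + z), since the two integrals cancel.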

Fig. 4 contains examples of the best fit redshifted
spectral templates for two galaxies alongside their observed
magnitudes in each filter. Fig. 5 contains a plot of all
individual K-corrections as a function of redshift, as well 
as the complete tracks for each model SED. The reddest 
K-corrections are for the elliptical type spectra with 
star-formation timescales of 15 and 13.2 Gyr. The bluest 
correction is that for the flat SED, followed by the 2.2 Gyr 
spiral template. The difference between our reddest and 
bluest K-corrections (from observed i-band to rest-frame 
MGC B-band) is negligible at z ~ 0.7 but grows rapidly
thereafter with increasing redshift. The spread reaches ~2 mag
by z = 1.5, ~4.5 mag by z = 2 and ~8.5 mag by z = 5.
This has a strong impact on our selection biases at high 
redshift since the bluest and reddest galaxies of a given 
B-band luminosity will be visible over vastly different 
volumes. One way to combat this problem is to use the 
technique of 'band-pass shifting', i.e. to correct the observed 
magnitude from whichever available filter samples nearest 
to each galaxy's redshifted, rest-frame B-band light. For 
objects beyond z ~ 1.5 this would mean correcting from
the flux through one of the infrared J, H or K s filters. 
Unfortunately, the GOODS and ISAAC observations in 
these bands are much shallower than the i-band image from 
which our catalogue was selected. For instance, only 37 per 
cent of galaxies in our sample have J-band photometry and 
these will be predominantly redder types, which negates 
the advantages of a smaller K-correction. Thus, for the 
present time we will restrict our investigation to redshifts 
below ~1.5 where the colour bias is less severe. 
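
The idea behind band-pass shifting can be illustrated by selecting, for a given redshift, the filter whose characteristic wavelength lies closest to the redshifted rest-frame B band. All wavelengths in the sketch below are rough, assumed values introduced for illustration (they are not quoted in the text), and the distance is measured in log wavelength.

```python
import math

REST_B_ANGSTROM = 4400.0  # ASSUMPTION: approximate rest-frame B-band wavelength

# ASSUMPTION: approximate characteristic wavelengths in Angstroms
FILTER_WAVELENGTHS = {
    "F435W": 4320.0,
    "F606W": 5920.0,
    "F775W": 7690.0,
    "F850LP": 9030.0,
    "J": 11000.0,
    "H": 16000.0,
    "Ks": 21600.0,
}

def nearest_filter(z, filters=FILTER_WAVELENGTHS, rest=REST_B_ANGSTROM):
    """Name of the filter closest (in log wavelength) to rest*(1+z)."""
    target = rest * (1.0 + z)
    return min(filters, key=lambda name: abs(math.log(filters[name] / target)))
```

With these assumed wavelengths the i-band (F775W) is the natural choice near z ~ 0.7, where the K-correction spread is smallest, while beyond z ~ 1.5 the closest filter is one of the infrared bands.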



2.7 Absolute Quantities 

The apparent magnitude ($m$) and mean surface brightness
($\langle\mu\rangle_e$) of the iUDF-BRIGHT galaxies were converted
into absolute quantities using their photometric redshifts
and the K-corrections described above. Luminosity
distances ($D_L$) in units of Mpc were calculated according
to an $\Omega_0 = 0.3$, $\Omega_\Lambda = 0.7$ and $H_0 = 100$ km s$^{-1}$ Mpc$^{-1}$
cosmological model. The relevant formulas are 

$$M = m - 5\log_{10}(D_L(z)) - 25 - K(z),$$

for the absolute magnitude and,

$$\langle\mu\rangle_{e,\mathrm{abs}} = \langle\mu\rangle_{e,\mathrm{app}} - 10\log_{10}(1 + z) - K(z),$$

for the absolute mean effective surface brightness. No 
evolutionary correction was imposed as this is the unknown 
Figure 4. Best-fit redshifted spectral templates for two example
sets of observed band-pass magnitudes (black dots) from the
M05 catalogue. The SED of the first galaxy (top panel) is well
described by the Sa type template with e-folding time of 7.7 Gyr
from the library of Poggianti (1997). The second galaxy (bottom
panel) is well fit by the Sc type template with a 5.9 Gyr
e-folding time. The transmission functions of all 7 filters indicate
their wavelength coverages. These have been scaled for clarity 
and are not intended to represent the total relative throughput 
in each band. 



we intend to constrain. 
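
The conversion to absolute quantities can be sketched numerically as follows, assuming a spatially flat universe (consistent with the adopted density parameters summing to one) and evaluating the luminosity distance with Simpson's rule. The function names and the number of integration steps are illustrative choices, not part of the original analysis.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def luminosity_distance(z, omega_m=0.3, omega_l=0.7, h0=100.0, n_steps=1000):
    """Luminosity distance in Mpc for a flat universe (omega_m + omega_l = 1)."""
    if z == 0.0:
        return 0.0
    if n_steps % 2:
        n_steps += 1  # Simpson's rule needs an even number of intervals

    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + omega_l)

    h = z / n_steps
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, n_steps):
        total += (4.0 if k % 2 else 2.0) * inv_e(k * h)
    comoving = (C_KM_S / h0) * total * h / 3.0
    return (1.0 + z) * comoving

def absolute_magnitude(m, z, k_corr):
    """M = m - 5 log10(D_L / Mpc) - 25 - K(z)."""
    return m - 5.0 * math.log10(luminosity_distance(z)) - 25.0 - k_corr

def absolute_surface_brightness(mu_app, z, k_corr):
    """Remove the (1+z)^4 dimming and the K-correction from mu_app."""
    return mu_app - 10.0 * math.log10(1.0 + z) - k_corr
```

The $10\log_{10}(1+z)$ term is simply the $(1+z)^4$ cosmological dimming expressed in magnitudes, which amounts to about 2.3 mag arcsec$^{-2}$ at z = 0.7 and about 3.0 mag arcsec$^{-2}$ at z = 1.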



3 SELECTION EFFECTS 

Any imaging survey is restricted and biased in its sampling
of the galaxy population by selection effects (Disney 1976;
Disney & Phillipps 1983; Impey & Bothun 1997; Driver
1999; Cross & Driver 2002; Driver et al. 2005). The
visibility of a particular galaxy



depends on its intrinsic properties (e.g. luminosity, scale 
size, light profile, distance and colour) and the nature of the
survey imaging data (e.g. exposure time, sky brightness, 
noise, bandpass and seeing). Furthermore, the accuracy 
with which a galaxy's true luminosity and scale size may 
be recovered not only depends strongly on the above 
mentioned parameters, but also the specific measurement
techniques used (Cross et al. 2004; Graham & Driver
2005). Understanding the limitations of the iUDF-BRIGHT
sample is critical to the robust comparison of absolute 
magnitude-size distributions at different redshifts. We have 
used artificial galaxy simulations to identify a region of 
minimal selection bias in the apparent magnitude-size 
plane. The boundaries of this region are then mapped into
the absolute magnitude-size plane via the method of Driver
(1999) for a series of volume-limited samples.
Figure 5. K-corrections from the observed i-band magnitude to
the rest-frame MGC B-band magnitude computed using fluxes
and redshifts from both the M05 (black triangles) and C06 (grey
squares) catalogues. The grey lines indicate K-correction
functions belonging to each of the 27 spectral templates used from
the library of Poggianti (1997) (see Section 2.6). A number of
these are labelled according to their Hubble type ('el' = elliptical,
'sa' = lenticular and 'sc' = spiral) followed by the age of their
model stellar population in units of 0.1 Gyr. 



is found having a centroid within 5 pixels (0.15") of the
input simulation position. This search radius was made
conservatively small to ensure the chance of erroneous
matches to existing, real objects was negligible. In Fig. 7
we plot 'recoverability', which we define to be the number of
detected galaxies having measured fluxes and effective
radii within 25 per cent of their input values. Finally,
in Fig. 8 we produce an 'error vector' diagram showing
the (3σ-clipped) mean size and direction of the difference
between input and recovered values for the detected objects
originating in each bin. Together, the recoverability and
error vector diagrams allow one to identify any regions of
the observable parameter space contaminated by galaxies
with measured magnitudes and half light radii that poorly
reflect their true, intrinsic properties.
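
The three per-bin diagnostics (completeness, recoverability and the error vector) can be illustrated with a short sketch. This is not the code used for the paper: the function names, the data layout (a list of recovered magnitude and radius pairs per bin) and the choice to express the size error in log units are all assumptions made for illustration. The 25 per cent tolerance is applied to flux and half light radius, following the Fig. 7 caption.

```python
import math


def sigma_clipped_mean(values, nsigma=3.0, max_iter=10):
    """Mean after iteratively rejecting points more than nsigma std devs from the mean."""
    vals = list(values)
    for _ in range(max_iter):
        if len(vals) < 2:
            break
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        kept = [v for v in vals if abs(v - mean) <= nsigma * std]
        if len(kept) == len(vals):
            break
        vals = kept
    return sum(vals) / len(vals) if vals else float("nan")


def bin_diagnostics(n_inserted, mag_in, re_in, recovered, tol=0.25):
    """Diagnostics for one simulation bin.

    recovered: list of (mag_out, re_out) for the artificial galaxies that
    were detected (hypothetical layout, chosen for this sketch).
    """
    good = 0
    for mag_out, re_out in recovered:
        flux_ratio = 10.0 ** (-0.4 * (mag_out - mag_in))
        size_ratio = re_out / re_in
        if abs(flux_ratio - 1.0) <= tol and abs(size_ratio - 1.0) <= tol:
            good += 1
    # Error vector: clipped mean offsets in magnitude and in log10(radius).
    dmag = sigma_clipped_mean([m - mag_in for m, _ in recovered])
    dlogre = sigma_clipped_mean([math.log10(r / re_in) for _, r in recovered])
    return {
        "completeness": len(recovered) / n_inserted,
        "recoverability": good / n_inserted,
        "error_vector": (dmag, dlogre),
    }


example = bin_diagnostics(4, 25.0, 0.5, [(25.0, 0.5), (25.1, 0.48), (26.0, 0.2)])
print(example)
```

In this toy bin three of four inserted objects are detected, but only two have both flux and size recovered to within the tolerance, which is the distinction between completeness and recoverability drawn in the text.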

The completeness results plotted in Fig. 6 indicate that
we can detect our simulated galaxies over almost the entire
region of the apparent magnitude-size plane spanned by the
iUDF-BRIGHT sample. The only area of low completeness
lies in the upper right corner of these plots and corresponds
to objects of extremely low mean surface brightness. In fact,
we detect over 75 per cent of simulated objects in our bins
out to $\langle\mu\rangle_e \sim 28.0$ mag arcsec$^{-2}$. None of the real, observed
galaxies are measured to lie within this problem area,
and the galaxy population appears to naturally decline in
density well before this boundary. One might be tempted
to conclude from this that our sample is not subject to any
significant surface brightness dependent selection effects.
However, we have not yet established that our flux and scale
size measurements are free of bias over the same region of
parameter space.



3.1 Apparent Limits — Simulations 

Artificial galaxy simulations are commonly used for
estimating survey selection limits (e.g. Aguerri & Trujillo
2002; Bouwens et al. 2004; Driver et al. 2005). However,
different authors vary significantly in their implementation
and interpretation of this technique. The method used here
is as follows. The luminosity-size plane was divided into
a 21 x 21 grid covering the relevant observational window.
For each grid point the IRAF artdata package was used
to generate 100 artificial galaxies with the corresponding
size and flux but with random positions in the field and
random axial ratios (between 0.3 and 1.0). We use exponential
light profiles simulated out to 5 $R_e$ and scaled to account for
the flux lost by this truncation. The simulated galaxy size
corresponds to the major axis half light radius. Each galaxy
was convolved with a Gaussian point spread function of
0.084" FWHM. These objects were inserted into the i-band
UDF image 25 at a time and SExtractor was run to search
for them. The extraction parameters chosen were identical
to those used to generate the iUDF-BRIGHT catalogue.
The half light radii of all detected artificial galaxies were
measured via the elliptical aperture method.
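
The size of the truncation correction mentioned above can be estimated analytically. The sketch below is an illustration only, not the artdata procedure: it assumes a pure exponential profile with scale length h, for which the fraction of the total flux inside radius r = x h is 1 - (1 + x) exp(-x), and it solves for the half light radius numerically rather than quoting a constant.

```python
import math


def enclosed_fraction(x):
    """Fraction of the total flux of an exponential profile inside r = x * h."""
    return 1.0 - (1.0 + x) * math.exp(-x)


def half_light_radius_in_h(lo=0.1, hi=10.0, tol=1e-10):
    """Bisection for R_e / h, i.e. the radius enclosing half of the total flux."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if enclosed_fraction(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


re_over_h = half_light_radius_in_h()
frac_5re = enclosed_fraction(5.0 * re_over_h)  # flux kept after truncating at 5 R_e
rescale = 1.0 / frac_5re                       # factor restoring the input total flux

print(f"R_e/h = {re_over_h:.4f}, flux inside 5 R_e = {frac_5re:.4f}, rescale = {rescale:.4f}")
```

Under these assumptions the truncation removes only a few tenths of a per cent of the flux, so the rescaling is a small correction compared with the 25 per cent tolerance used for recoverability.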

Figures 6, 7 and 8 display the results of our simulations.
In Fig. 6 we show sample completeness as a function of
apparent magnitude and effective radius (i.e., the number
of artificial galaxies detected of the 100 inserted at each
grid point). A galaxy is defined as detected if an object



The recoverability results in Fig. 7 reveal that there
are biases in the Kron fluxes and half light radii computed
for both (apparent, not necessarily intrinsically) faint
and low surface brightness galaxies, even in cells of high
detection completeness. These biases were not unexpected,
since they stem from well-documented problems of the
Kron magnitude technique (Andreon 2002; Benítez et al.
2004; Graham & Driver 2005). The Kron magnitudes
computed by SExtractor are the sum of light enclosed
within an aperture of radius 2.5 times the luminosity-weighted
Kron radius (twice the image moment radius).
In theory this should contain 96.0 per cent of the light of
a pure exponential profile, a short-fall of just 0.04 mag
relative to the true, intrinsic galaxy flux. In practice, however,
the calculation of the Kron radius is never perfect and is
systematically under-estimated in galaxy images with few
effective radii sampled above the background sky, i.e., in
low surface brightness objects (see Graham & Driver 2005,
their Section 2.6). Under-estimating the Kron radius results
in an under-estimation of total flux. This follows on to an
under-estimation of the half light radius when the half flux
value is derived from the Kron magnitude. In constructing
the recoverability plots of Fig. 7 an adjustment was made
to allow for the theoretical short-falls in total flux (and,
hence, scale size) expected for a correctly measured Kron
radius, namely 0.04 mag and 4 per cent of the scale size.
The surplus measurement errors that appear in this plot
are then primarily due to mis-calculation of the Kron
radius in low surface brightness systems. The error vector
plot in Fig. 8 illustrates and confirms the expected sense
of these errors, which tend to scatter any very low surface
brightness galaxies detected towards fainter magnitudes
and smaller sizes. Using these plots we determine the
bias-free selection limit $\langle\mu\rangle = 27.0$ mag arcsec$^{-2}$ below
which ~75 per cent of objects have reliably recovered
parameters. This limit is 1 mag arcsec$^{-2}$ brighter than
that determined using the simple completeness results. In
addition, we now find it significantly encroaches on the
distribution of real, observed galaxies, meaning that the
iUDF-BRIGHT sample is clearly not entirely free of surface
brightness dependent selection biases. This illustrates the
importance of considering the limits on both completeness
and parameter recoverability.
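
The theoretical Kron short-falls quoted above (about 96 per cent of the flux, 0.04 mag, and roughly 4 per cent in scale size) can be reproduced for an idealised case. The sketch assumes a noiseless, circular, pure exponential profile, for which the first-moment (Kron) radius is exactly two scale lengths; the function names are hypothetical and this is not SExtractor's implementation. The last call shows how an under-estimated Kron radius increases the flux deficit.

```python
import math


def enclosed_fraction(x):
    """Fraction of the total flux of an exponential profile inside r = x * h."""
    return 1.0 - (1.0 + x) * math.exp(-x)


def radius_enclosing(frac, lo=1e-6, hi=50.0, tol=1e-10):
    """Radius (in units of h) enclosing the given fraction of the total flux."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if enclosed_fraction(mid) < frac:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def kron_shortfall(kron_radius_in_h, k=2.5):
    """Flux fraction, magnitude deficit and fractional size deficit for a Kron aperture."""
    frac = enclosed_fraction(k * kron_radius_in_h)
    mag_deficit = -2.5 * math.log10(frac)
    true_re = radius_enclosing(0.5)
    kron_re = radius_enclosing(0.5 * frac)  # half of the Kron flux, not of the true flux
    return frac, mag_deficit, 1.0 - kron_re / true_re


# Ideal case: Kron radius = 2 h for an exponential profile.
ideal = kron_shortfall(2.0)
# Illustrative under-estimate of the Kron radius (value chosen arbitrarily).
biased = kron_shortfall(1.5)
print(ideal)
print(biased)
```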

The recoverability plots also indicate that cutting our
sample at 28th mag was a sensible decision, as galaxies
brighter than this (and brighter than the surface brightness
limit) are recovered at a rate of over 95 per cent in most
bins, whereas for fainter galaxies in our 28.0-28.5 mag bins
the recoverability rate falls to between 75 and 85 per cent
due to their very low signal-to-noise in the image. There
is a slight suggestion of a limit on the recoverability of 
compact objects in this plot, but our simulations are not 
very realistic in this regard. In particular, they mask the 
limitations of our crude PSF modelling. In the simulations 
we convolve all our objects with a perfect Gaussian profile 
of FWHM 0.084" and later correct the measured sizes using 
Eqn. El In reality the UDF i-band PSF will neither be 
perfectly Gaussian nor of a constant size across the entire
image — 0.084" was simply the average value computed 
from our brightest, non-saturated stars, and had a standard 
deviation of 0.007". This level of error in our approximation 
to the true UDF PSF is insignificant for the vast majority 
of our sample, but would begin to cause problems in objects 
whose size is similar to that of the PSF (i.e., half light radii
~0.042"). We thus impose a conservative minimum size
cut of $r_{\rm min}$ = 0.06" to exclude such objects and will look
to build a more realistic PSF handling scenario into future
versions of our simulation procedure.

One would expect there to also be a limit on the 
largest galaxies we could reliably measure in the UDF. 
Very extended sources are more likely to overlap with 
other objects in the line of sight, which leads to deblending 
difficulties. Furthermore, galaxies of comparable size to the 
field of view can cause problems for proper background 
subtraction. However, the simulations reveal that neither 
of these issues prevented the reliable measurement of fluxes 
and half light radii for the largest profile sizes tested here of
5". As the largest real object in our sample is 1.8" in half
light radius, our observed distribution is clearly not affected
by this limit. We nevertheless illustrate a limit at $r_{\rm max}$ = 5" in
our plots for consistency, and to indicate the direction of
its movement relative to the others at different redshifts.
There is also an effective bright apparent magnitude limit
on this survey due to the deliberate choice of a field with no
known bright galaxies. We estimate this from the brightest
galaxy in our sample, $m_{\rm bright}$ = 18.26 mag.

By considering all the limits described above we define
an observational window in apparent space inside which our
sample is complete and structural parameters are reliably
recovered. We shall refer to this as the bias-free region. The
full five-sided bias-free region is indicated in Fig. 7.

Figure 6. 'Completeness' diagram for objects in the UDF i-band
image computed from our galaxy simulations. In each bin the
grey scale indicates the number of galaxies detected of the 100
inserted with that size and magnitude. The real, observed
iUDF-BRIGHT galaxy population is overlaid as black dots. The low
surface brightness completeness limit, beyond which the detection
rate falls below 75 per cent, is also marked.



3.2 Absolute Limits 

The selection boundaries derived for the apparent
luminosity-size plane are easily extended to the absolute
regime for a volume-limited sample via the method of Driver
(1999). We define such a sample by binning our data in narrow
redshift intervals ($z_{\rm low}$ to $z_{\rm high}$), as shown in Fig. 9. The
constraint on faint absolute magnitudes is given by

$$M_{\rm faint} = m_{\rm faint} - 5\log_{10}\left(d_L(z_{\rm high})\right) - 25 - K_{\rm red}(z_{\rm high}),$$

where $d_L$ is the luminosity distance in Mpc and the applicable
K-correction is that of the reddest galaxy in our sample.
Likewise, the corresponding constraint on the most luminous
galaxies is

$$M_{\rm bright} = m_{\rm bright} - 5\log_{10}\left(d_L(z_{\rm low})\right) - 25 - K_{\rm blue}(z_{\rm low}),$$

using the bluest galaxy K-correction. These limits are
illustrated in the magnitude-redshift plot of Fig. 9, which
highlights the problem of working with large K-corrections.
At high redshift the bluest and reddest galaxies in our
sample are visible over vastly different volumes, thereby
diminishing the luminosity range over which we sample all
spectral types evenly.
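
As a hedged illustration of how the two magnitude limits could be evaluated, the sketch below computes the luminosity distance for a flat cosmology and applies the distance modulus with a K-correction. The cosmological parameters (H0 = 70 km/s/Mpc, matter density 0.3) and both K-correction values are placeholders chosen for this example, not values taken from this section; only the apparent limits of 28.0 and 18.26 mag and the z = 0.6-0.75 interval come from the text.

```python
import math

# Assumed cosmology (not stated in this section): flat Lambda-CDM.
H0 = 70.0            # km/s/Mpc, assumption
OMEGA_M = 0.3        # assumption
OMEGA_L = 0.7        # assumption
C_KMS = 299792.458   # speed of light in km/s


def luminosity_distance_mpc(z, steps=1000):
    """Luminosity distance in Mpc for flat Lambda-CDM (Simpson's rule, even steps)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + OMEGA_L)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    comoving = (C_KMS / H0) * total * h / 3.0
    return (1.0 + z) * comoving


def absolute_mag_limit(m_lim, z, k_corr):
    """M = m - 5 log10(d_L / Mpc) - 25 - K(z)."""
    return m_lim - 5.0 * math.log10(luminosity_distance_mpc(z)) - 25.0 - k_corr


# Hypothetical K-corrections for the reddest and bluest templates.
K_RED_ZHIGH = 0.5
K_BLUE_ZLOW = -0.2

m_faint_abs = absolute_mag_limit(28.0, 0.75, K_RED_ZHIGH)
m_bright_abs = absolute_mag_limit(18.26, 0.60, K_BLUE_ZLOW)
print(f"M_faint = {m_faint_abs:.2f} mag, M_bright = {m_bright_abs:.2f} mag")
```

The faint limit uses the far edge of the redshift bin with the reddest K-correction and the bright limit uses the near edge with the bluest, so the window between them is the range over which every spectral type is visible throughout the volume.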

Figure 7. 'Recoverability' diagram for objects in the UDF i-band
image computed from our galaxy simulations. In each bin the grey
scale indicates the number of galaxies detected with measured
fluxes and half light radii within 25 per cent of their input values
of the 100 inserted. The real, observed iUDF-BRIGHT galaxy
population is overlaid as black dots. The low surface brightness
reliability limit, beyond which the recoverability rate falls below
75 per cent, is marked in black too. The remaining limits making
up our five-sided bias-free region are illustrated with broken lines.

Figure 8. 'Error vector' diagram for objects in the UDF i-band
image computed from our galaxy simulations. The grey arrow
emanating from each bin indicates the typical size and direction
of the systematic error in our flux and scale size measurements.
Each arrow terminates at the coordinate of the 3σ-clipped mean
magnitude and half light radius of the detected objects from that
simulation bin. The real, observed iUDF-BRIGHT galaxy
population is overlaid as black dots. The bias-free region is marked
with black lines.



The apparent surface brightness bound on the bias-free
region translates to the following absolute surface brightness
limit,

$$\langle\mu\rangle_{e,{\rm abs,lim}} = \langle\mu\rangle_{e,{\rm app,lim}} - 10\log_{10}(1 + z_{\rm high}) - K_{\rm red}(z_{\rm high}).$$

The dual impact of the $(1 + z)^4$ cosmological dimming and
the growing red K-correction means that this is potentially
the most restrictive of the selection limits on distant galaxies.
It is shown as a diagonal line on the luminosity-size
diagrams in Fig. 10. The maximum and minimum apparent
half light radii limits (in arcsec) are converted to absolute
scale sizes (in kpc) via the formula

$$r_{e,{\rm abs}}^{\rm max/min} = r_{e,{\rm app}}^{\rm max/min}\,\frac{\pi}{3600 \times 180}\,\frac{d_L(z_{\rm low/high})}{(1 + z_{\rm low/high})^2} \times 1000.$$

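
The two conversions above amount to a few lines of arithmetic, sketched here for illustration. The function names are hypothetical, the luminosity distance is passed in as an argument (in Mpc) so that no particular cosmology is implied, and the K-correction and distance used in the example call are placeholder values rather than numbers from the paper.

```python
import math


def absolute_sb_limit(mu_app_lim, z_high, k_red):
    """Absolute surface brightness limit: (1+z)^4 dimming plus the red K-correction."""
    return mu_app_lim - 10.0 * math.log10(1.0 + z_high) - k_red


def physical_size_kpc(r_arcsec, z, d_l_mpc):
    """Convert an angular half light radius (arcsec) to kpc.

    Uses the angular diameter distance d_A = d_L / (1 + z)^2, with d_L in Mpc.
    """
    arcsec_to_rad = math.pi / (3600.0 * 180.0)
    d_a_mpc = d_l_mpc / (1.0 + z) ** 2
    return r_arcsec * arcsec_to_rad * d_a_mpc * 1000.0


# Cosmological dimming alone at z = 0.75 (K-correction set to zero here).
print(absolute_sb_limit(27.0, 0.75, 0.0))
# Size limits for an illustrative, assumed luminosity distance of 4600 Mpc.
print(physical_size_kpc(0.06, 0.75, 4600.0), physical_size_kpc(5.0, 0.75, 4600.0))
```

The first call shows that the dimming term alone moves a 27.0 mag arcsec^-2 apparent limit more than two magnitudes brighter by z = 0.75, which is why this boundary tends to be the most restrictive one at high redshift.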

4 RESULTS 

4.1 Luminosity-Size Diagrams 

In Fig. 10 we present the B-band luminosity-size distribution
(LSD) of iUDF-BRIGHT galaxies in three narrow
redshift bins: z = (0.2-0.35), (0.6-0.75) and (1.0-1.15) (as
shown in Fig. 9). These were chosen to lie at, and either side
of, the redshift at which the i-band filter samples galaxy
rest-frame B-band light (i.e., where the K-correction is
approximately zero). Selection limits on these volume limited
samples were computed as described in Section 3.2 and are
overlaid in grey. Our local (z = 0.1) benchmark is derived
from the MGC B-band bivariate brightness distribution
(BBD) in absolute magnitude and surface brightness. A
detailed description of the MGC dataset is contained in
Liske et al. (2003) and construction of the MGC BBD is
explained in Driver et al. (2005). The equivalent MGC LSD
used here was generated in the same manner, except that
the data binning and function fitting were performed in
the $L$-$R_e$ plane rather than $L$-$\langle\mu\rangle_e$. A contour plot of
the resulting number density of MGC galaxies in the $L$-$R_e$
plane is overlain in Fig. 10 for comparison against our UDF
samples at higher redshifts. To assist in this, the selection
boundary of the MGC data (as defined by an isovolume
contour at 100 Mpc$^3$) is marked in blue.

Figure 9. The absolute magnitude of all iUDF-BRIGHT galaxies
with z < 2.5 as a function of redshift. The values derived
using the M05 catalogue are shown as black triangles and those
using the C06 data as grey squares. Long and short-dashed lines
indicate the upper and lower selection limits in magnitude using
K-corrections for the bluest and reddest galaxies respectively.
Volume-limited samples are constructed for three narrow redshift
intervals: (a) z = 0.2-0.35, (b) z = 0.6-0.75 and (c)
z = 1.0-1.15. These are designated by the black rectangles;
thin lines encompass all objects detected in that redshift range,
while the thick lines enclose only those objects within the
magnitude selection limits encompassing all spectral types (i.e.,
for all K-corrections).

It is clear from Fig. 10 that in the interval z =
(0.2,0.35) the UDF survey samples a rather different region
of the luminosity-size plane than the MGC does locally. In
particular, the extraordinary depth of the UDF imaging
and the deliberate pointing away from known bright, nearby
galaxies mean that at low redshift it primarily detects
very faint galaxies (in the range M = -14 to -12 mag).
The MGC local, bright galaxy sample, on the other hand,
is limited to objects with apparent B-band magnitudes
brighter than 20th mag. This corresponds to a selection limit of
$M_{\rm faint} < -13.9$ mag at its median redshift (z = 0.1). In our
highest redshift interval, z = (1.0,1.15), the region of valid
comparison on the $L$-$R_e$ plane is also rather small. At these
high redshifts the bias-free window of parameter space
accessible with the iUDF-BRIGHT sample only covers a
fraction of the local MGC relation in the bright, compact 
regime. It is in the intermediate redshift sample at z = 
(0.6,0.75) that the UDF and MGC observational windows 
best coincide. Here we sample the full width of the z ~ 0.7 
LSD over almost 9 mag with only a slight bias against
low surface brightness galaxies for our faintest objects at
$M_B \sim -16$ to $-14$ mag. According to the M05 photometric
redshifts, there are 169 iUDF-BRIGHT galaxies in this 
volume-limited sample that lie within the selection bound- 
aries of both surveys (and 212 for the C06 catalogue). An 
eyeball comparison suggests the UDF objects have a similar 
distribution to the MGC ones, but are somewhat brighter 
and more compact, suggesting moderate evolution in the 
LSD to these redshifts. We quantify this via a 2D K-S test 
below. 



4.2 2-D Kolmogorov-Smirnov Test 

Figure 10. Luminosity-size diagrams for each of the three
iUDF-BRIGHT volume-limited samples: (a) z = 0.2-0.35, (b) z =
0.6-0.75 and (c) z = 1.0-1.15. The bias-free selection boundaries
(as defined in Section 3.2) are indicated with white lines and
grey shading. The accessible parameter space within is given a
white background. The effect of using the low surface brightness
reliability limit in addition to the simple completeness limit is
emphasised by plotting both lines and shading the difference in
a lighter grey. Black contours trace the local MGC luminosity-size
relation at number densities of $10^{-5}$, $10^{-4}$, $10^{-3}$ and $10^{-2}$
Mpc$^{-3}$. The MGC selection boundary as defined by an isovolume
contour at 100 Mpc$^3$ is marked with a thick, black, dashed line.
The top diagram is constructed using photometric redshifts from
M05 and the bottom using photometric redshifts from C06.

The Kolmogorov-Smirnov (K-S) test provides an estimate
of the probability that two distributions are drawn from
the same population. In the one-dimensional K-S test this
probability is computed from the maximum cumulative
difference between the two distributions. In its extension
to two dimensions the integrated probability in each of
four quadrants around a given point forms the basis of the
evaluation (Press 1992). The implementation used here
is ks2d2s from the Numerical Recipes library, which is 
valid for sample sizes greater than 20 objects. Its output 
probability estimate is less accurate above values of 20 per 
cent, although probabilities greater than this do correctly 
indicate that the two distributions being compared are very 
similar. Our primary input sample for this test consists of 
the 169 iUDF-BRIGHT galaxies (M05 redshift catalogue) 
within the interval z = (0.6,0.75) and contained inside both 
the UDF and MGC selection limits. The first comparison 
we make is to a set of 1000 galaxies drawn from the MGC 
z ~ 0.1 $L$-$R_e$ BBD using a basic Monte Carlo technique.
The K-S test result is a probability of 19 per cent that these 
two samples are drawn from the same population, which is 
only a mild degree of similarity by this measure. 
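
For readers unfamiliar with the two-sample, two-dimensional test, the following is a minimal pure-Python sketch of the quadrant-counting approach on which routines such as ks2d2s are based. It is written from the published description of the method and is not the Numerical Recipes source; the function names, the 100-term series cut-off and the small-argument shortcut in the significance function are choices made for this illustration.

```python
import math
import random


def quadrant_fractions(x0, y0, pts):
    """Fractions of pts falling in the four quadrants around (x0, y0)."""
    na = nb = nc = nd = 0
    for x, y in pts:
        if y > y0:
            if x > x0:
                na += 1
            else:
                nb += 1
        elif x > x0:
            nd += 1
        else:
            nc += 1
    n = float(len(pts))
    return na / n, nb / n, nc / n, nd / n


def max_quadrant_difference(origins, s1, s2):
    """Largest quadrant-fraction difference between s1 and s2 over the given origins."""
    d = 0.0
    for x0, y0 in origins:
        f1 = quadrant_fractions(x0, y0, s1)
        f2 = quadrant_fractions(x0, y0, s2)
        d = max(d, max(abs(a - b) for a, b in zip(f1, f2)))
    return d


def pearson_r(pts):
    """Linear correlation coefficient of a list of (x, y) pairs."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    syy = sum((y - my) ** 2 for _, y in pts)
    return sxy / math.sqrt(sxx * syy)


def ks_significance(lam, terms=100):
    """Kolmogorov distribution tail, Q(lam) = 2 sum (-1)^(j-1) exp(-2 j^2 lam^2)."""
    if lam < 0.3:  # Q is indistinguishable from 1 here and the series converges slowly
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


def ks2d_two_sample(s1, s2):
    """Return (D, probability) for two lists of (x, y) points."""
    d = 0.5 * (max_quadrant_difference(s1, s1, s2) + max_quadrant_difference(s2, s1, s2))
    n1, n2 = len(s1), len(s2)
    sqen = math.sqrt(n1 * n2 / (n1 + n2))
    r1, r2 = pearson_r(s1), pearson_r(s2)
    rr = math.sqrt(1.0 - 0.5 * (r1 * r1 + r2 * r2))
    lam = d * sqen / (1.0 + rr * (0.25 - 0.75 / sqen))
    return d, ks_significance(lam)


random.seed(1)
sample_a = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(60)]
sample_b = [(x + 10.0, y + 10.0) for x, y in sample_a]  # same shape, shifted far away
d_same, p_same = ks2d_two_sample(sample_a, sample_a)
d_diff, p_diff = ks2d_two_sample(sample_a, sample_b)
print(d_same, p_same, d_diff, p_diff)
```

In the application described here the two inputs would be the (magnitude, log size) pairs of the UDF sample and of a Monte Carlo realisation of the MGC distribution; the toy samples above only demonstrate the two limiting behaviours.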

In order to constrain evolutionary scenarios, we exam- 
ine whether scaling the MGC LSD in luminosity and/or 
scale size can produce a higher K-S test probability than 
the case of null evolution. Our method was to generate a 
mock MGC data set of 1000 galaxies for each trial MGC 
LSD scaling and then run the 2-D K-S test to compare 
it to the UDF sample. We do this for a broad range of 
scenarios from galaxies being 1.3 mag fainter to 1.3 mag 
brighter, and from 70 per cent smaller to 70 per cent 
larger. The resulting probability values are displayed as 
a contour plot in Fig. 11. It is clear from this figure that
there is a wide range of scalings providing a higher degree 
of similarity to the z ~ 0.7 UDF LSD than the z ~ 0.1 
MGC LSD with no evolution. The best fits are found in 
two separate regions of this parameter space. The first 
corresponds to mainly luminosity evolution with galaxies 
being typically ~0.7-1.1 mag brighter at z = 0.675 than
they are at z ~ 0.1, and between ~20 per cent smaller and
10 per cent larger. The peak of this region is at $\Delta L = -0.9$
mag and $\Delta R_e = -5$ per cent. The second region of good
fit corresponds to galaxies being on average ~0.3 mag 
brighter and ~25 per cent smaller in the past. These two 
likely evolutionary scenarios both equate to similar degrees 
of surface brightness evolution, ~1.0 mag for the first and 
~0.9 mag for the second, which is necessary to bring the 
ridge lines of the two LSDs into agreement. The first case 
with greater luminosity evolution offers a superior fit to 
the bright end of the distribution than the second one. 
Since the bright end of the UDF z = 0.675 LSD has the 
most reliably measured magnitudes and scale sizes and is 
well clear of our selection limits, one should attach greater 
importance to the fit there. As the K-S test does not allow 
for such a weighting to be set explicitly, we simply note that 
$\Delta L = -0.9$ mag with $\Delta R_e = -5$ per cent is our preferred
result, but that we cannot rule out the case of $\Delta L = -0.3$
mag with $\Delta R_e = -25$ per cent.
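
The link between a given luminosity and size scaling and the implied surface brightness change can be checked with one line of arithmetic. The sketch assumes that mean surface brightness within the half light radius scales as magnitude plus 5 log10 of the radius, so that a change of ΔL mag and a fractional size change ΔR_e/R_e give ΔL + 5 log10(1 + ΔR_e/R_e); the function name is hypothetical.

```python
import math


def surface_brightness_change(delta_mag, delta_size_fraction):
    """Change in mean surface brightness (mag arcsec^-2); negative means brighter."""
    return delta_mag + 5.0 * math.log10(1.0 + delta_size_fraction)


# The two best-fit scalings quoted in the text.
print(surface_brightness_change(-0.9, -0.05))
print(surface_brightness_change(-0.3, -0.25))
```

Under this assumption the two scalings correspond to brightenings of close to 1.0 and 0.9 mag arcsec^-2 respectively, matching the values quoted for the two regions of good fit.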

Figure 11. Results of the 2-D K-S test for different scalings of the
MGC z ~ 0.1 luminosity-size distribution compared to the
iUDF-BRIGHT z = (0.6,0.75) sample. The contours show the output
probability value, which is indicative of the likelihood that the
higher z UDF LSD could have been drawn from a population
described by the scaled MGC LSD. The two 50 per cent contours
enclose our best fit scenarios peaking at $\Delta L = -0.9$ mag,
$\Delta R_e = -5$ per cent and $\Delta L = -0.3$ mag, $\Delta R_e = -25$ per cent.
To aid the reader, lines of constant mean surface brightness evolution are
marked in grey, spanning null evolution to the case of galaxies
being 1.5 mag arcsec$^{-2}$ brighter in the past.

Repeating this analysis with the iUDF-BRIGHT LSD
derived from the C06 photometric redshift catalogue we
find a broad agreement with the evolution predicted using
the M05 redshifts. In particular, the 20% contours of each
K-S test are very similar and isolate essentially the same
region of parameter space. The only difference is that the
C06 results also allow the possibility of galaxies having
been substantially fainter and smaller in the past (by up to
1.2 mag and -60%). This scenario arises because of the large 
number of faint blue galaxies found at z ~ 0.6—0.75 in the 
C06 analysis. This over-abundance of faint objects relative 
to the local MGC LSD also means that none of our simple 
scaled evolutionary scenarios provide >35% probabilities 
in the K-S test. Until more reliable (i.e., spectroscopic) 
redshifts are available for a significant number of these 
faint systems it will be impossible to properly characterise 
the faint galaxy population at these redshifts. For now we 
can only acknowledge the difficulties we face in this type 
of study and make the best use of the data available to 
us. It is difficult to estimate a formal uncertainty on our 
most likely evolutionary fit because of the inaccuracy of 
the K-S test above 20% probability. However, as our errors 
are overwhelmingly dominated by those in the photometric 
redshift estimates we simply acknowledge the full range 
of acceptable fits for both catalogues (as shown in Fig.
11). This includes surface brightness evolution to z ~ 0.7
spanning a 0.5 mag arcsec$^{-2}$ dimming to a 1.5 mag arcsec$^{-2}$
brightening.



5 DISCUSSION 

A number of previous studies have explored evolution of the 
galaxy LSD beyond the local universe using deep imaging
data. This field of investigation came of age with the launch
of the Hubble Space Telescope, which allowed the first
high resolution, optical imaging studies of the high redshift
galaxy population (e.g. Driver, Windhorst & Griffiths 1995;
Driver et al. 1995). In one of the earliest such studies
Schade et al. (1995) analysed HST B- and I-band images
of 32 galaxies at 0.5 < z < 1.2 randomly selected from the
Canada-France Redshift Survey (CFRS). They identified
objects in their sample as being similar to the local mix of 
morphological types, based on an eyeball classification, with 
the exception of 9 galaxies dominated by blue compact com- 
ponents. After also performing bulge-disc decomposition, 
they found 15 normal late type galaxies with B/T < 0.5 and 
disc luminosities $M_B - 5\log h_{50} < -19.8$ mag. Compared
to the local Freeman value, the B-band central surface
brightness of these objects was higher by 1.2 mag arcsec$^{-2}$,
which Schade et al. (1995) attribute to evolution of the disc
galaxy luminosity function. Roche et al. (1998) obtained a
sample of 270 galaxies by combining spectroscopic redshifts
from various sources, including the CFRS, with HST
imaging (B, I and in some cases V and U-band) from the
HDF and other surveys. They investigated the LSD at a 
range of z intervals and found B-band surface brightness 
evolution of 0.95 ± 0.22 mag between z ~ 0.2 and z ~ 0.9, 
with similar evolution for all morphological types. The 
authors explain this with an 'inside-out' style disc formation 
model, whereby the half light radius increases with time. 
However, the results of these studies have been questioned 
by later works in which selection effects have been given a
more in-depth consideration, such as Simard et al. (1999)
and Ravindranath et al. (2004).



Simard et al. (1999) studied the LSD of a sample of
190 field galaxies in the Groth Survey Strip with HST V
and I-band imaging and spectroscopic redshifts from the
DEEP survey. Bulge-disc decomposition was performed to 
identify disc-dominated systems (B/T < 0.5) and extract 
structural properties. The disc LSD was then constructed 
at a series of redshift intervals and surface brightness 
measured to evolve by 1.3 mag in the rest-frame B-band 
from z ~ 0.2 to 0.8, a similar amount to earlier studies
such as Schade et al. (1995) and Roche et al. (1998). The
authors then recalculated these LSDs with a weighting that 
essentially applied the selection function of the highest 
redshift bin to that of the lower redshift bins. This was 
done in order to account for observational incompleteness 
biases against faint and low surface brightness objects, 
and to ensure the comparison was being made over the 
same range of luminosities at all redshifts. The result was 
that no detectable mean surface brightness evolution was 
observed over the redshift range 0.1 to 1.1 for discs with 
$-19 < M_B - 5\log h_{70} < -22$ mag. This conclusion is
supported by Ravindranath et al. (2004) in a study using
the HST GOODS images and photometric redshifts. They 
also restrict their analysis to the galaxies in the lower 
redshift bins that fall within the selection boundaries on 
their highest redshift bin at z = 1.0-1.25. The disc size
function for objects within these bounds is found to remain 
constant over the range 0.25 < z < 1.25. 

The strict selection function approach has recently
been criticised for making inadequate use of the information
provided by the observed galaxy LSD at each redshift.
In the latest studies, such as Bouwens et al. (2004) and
Barden et al. (2005), the authors instead attempt to
establish whether or not the surface brightness (or size)
distribution at each redshift is biased by incompleteness
effects. Barden et al. (2005) combined HST imaging from
GEMS with COMBO-17 photometric redshifts to search 
for evolution in the disc galaxy LSD and stellar mass-size 
relations out to z ~ 1. They identified disc-dominated 
systems by their global Sersic index (n < 2.5) and used 
artificial galaxy simulations to estimate their completeness 
function in the apparent magnitude-size plane. The faint 
magnitude limit restricts them to the study of objects 
with $M_V < -20$ at z ~ 1.0, and they applied this limit to
their sample at all redshifts to avoid biases in mean surface 
brightness due to the slope of the LSD. At each redshift 
interval they then constructed histograms of the surface 
brightness distribution, weighting by the completeness 
function, and established that each was approximately
Gaussian. Under the assumption that the disc galaxy 
surface brightness distribution is intrinsically, roughly 
Gaussian and uni-modal at all redshifts, they then went on 
to conclude that they were not missing significant numbers 
of galaxies at any redshift. Finally, they fit a linear relation 
to the mean surface brightness as a function of redshift and 
found a slope of $-1.43 \pm 0.07$ in rest-frame B-band, i.e., an
increase of 0.96 mag arcsec$^{-2}$ to z = 0.67.



Bouwens et al. (2004) used a slightly different approach
to establish whether their high redshift galaxy samples are 
affected by low surface brightness incompleteness problems. 
They constructed a UBVi-dropout sample set at redshifts 
z ~ 2.5 - 6.0 from the HDF, GOODS and UDF images. 
For each filter dropout sample they compared the GOODS 
galaxy apparent magnitude-size distribution with that from 
the deeper UDF (and UDF-P) imaging. As the primary 
effect of pushing back the surface brightness completeness 
limits via the additional UDF exposure time was to add 
compact objects at the faint magnitude limit, the authors 
concluded that the shallower GOODS data is essentially 
complete at bright magnitudes ($-19.7 < M_{1700} < -21.07$).
From the galaxies in this luminosity range they measured
size evolution of $(1 + z)^{-1.05 \pm 0.21}$ to z ~ 6. Trujillo et al.
(2005) found a similar degree of evolution in their study,
which used J, H and K-band imaging from the VLT FIRES 
data to probe z > 1 optical sizes. After dividing their sample
into bulge and disc-dominated systems by global Sersic
index (cut at n = 2.5), the authors compared the observed 
size distribution of high redshift objects to completeness 
limits derived from simulations. As the number of observed 
galaxies decreases more rapidly towards larger sizes than do 
the completeness limits, they argue that incompleteness is 
not biasing the data. Relative to the SDSS luminosity-size 
relation they find galaxies with $L_V > 3.4 \times 10^{10}\,h_{70}^{-2}\,L_\odot$ at
z ~ 2.5 are ~3.5 times smaller than for equally luminous 
galaxies today. 

Here in our study of the UDF we have used yet another 
approach to analysing selection biases and quantifying 
evolution in the galaxy LSD. Firstly, we have performed 
detailed artificial galaxy simulations to establish both the 
completeness limits and reliability limits of our data, noting
particularly the increasing systematic under-estimation
of total flux by Kron magnitudes towards low surface 
brightnesses. And secondly, by using a local galaxy survey 
also with well defined selection boundaries (the MGC) we 
are able to identify a broad region of the LSD over which 
both samples are free of bias. This enabled us to establish 
clear evidence of evolution in mean surface brightness of the 
galaxy population for a range of over 8 mag in luminosity. 
Specifically, we found an increase of 1.0 mag arcsec$^{-2}$ from
z ~ 0.1 to z = 0.675 for objects with $-22 < M_B < -14$
mag. Assuming a linear trend with redshift we can extrapolate
this to a 1.05 mag arcsec$^{-2}$ increase from z = 0 to
z ~ 0.7. This is in agreement with the 0.96 mag arcsec$^{-2}$
to z ~ 0.67 recently found by Barden et al. (2005), and
indeed with the earlier results of Schade et al. (1995) and
Roche et al. (1998). It is also consistent with the surface
brightness evolution of ~1.2 mag arcsec$^{-2}$ predicted by
the Bouwens et al. (2004) fit to z > 2.5 size evolution.
In summary, we confirm the evolution in mean surface
brightness of the bright galaxy population observed in these
previous studies, and demonstrate that it holds down to
$M_B \sim -14$ mag.
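
As a rough consistency check on the last of these comparisons, a size scaling of $(1+z)^{-1.05}$ can be converted into a surface brightness change. The worked line below assumes evolution at fixed luminosity and evaluates the scaling at z = 0.7 relative to z = 0; both choices are assumptions made for this illustration and are not taken from Bouwens et al. (2004).

```latex
\begin{align}
\Delta\langle\mu\rangle_e
  &= 5\log_{10}\!\left[\frac{R_e(z)}{R_e(0)}\right]
   = 5 \times (-1.05)\,\log_{10}(1+z) \\
  &\approx -5.25 \times \log_{10}(1.7)
   \approx -1.2~\mathrm{mag~arcsec^{-2}} \qquad (z = 0.7)
\end{align}
```

The negative sign denotes a brightening, so under these assumptions the extrapolated size scaling corresponds to galaxies being about 1.2 mag arcsec$^{-2}$ brighter at z = 0.7.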

It should be noted that our comparison to the
Barden et al. (2005) and Schade et al. (1995) results is
imperfect because they specifically measured evolution of
disc dominated systems, whereas we examine the galaxy
population as a whole. As the UDF and MGC are both field
galaxy surveys one would expect a majority of late-type
systems. In fact, eyeball classification of the MGC sample
within the selection boundaries defined by Driver et al.
(2005) found ~34 per cent early-type systems. Our decision
not to attempt a morphological subdivision of our sample 
meant we were able to avoid the added complications of 
choosing the most meaningful and robust criteria on which 
to classify galaxies at vastly different redshifts and from 
different datasets. However, this is somewhat of a limitation 
on our ability to use these results to distinguish between 
galaxy formation scenarios, which generally offer predictions 
for the evolution of specific morphological classes. In a 
future paper we will present a detailed structural analysis 
of the 169 objects in our z = 0.675 volume-limited sample, 
quantify evolution by galaxy type, and compare our findings 
to theoretical expectations. 



luminosity-size plane and presented the UDF galaxy LSD 
for a series of volume limited samples. By comparison to 
the LSD of the Millennium Galaxy Catalogue, a nearby 
galaxy survey with well defined selection limits, we iden- 
tified a region of the M-R e plane over which the z ~ 0.1 
and z ~ 0.7 galaxy populations can be compared free of 
selection biases. Evolution was quantified via a 2D K-S test 
and an increase of ~1.0 mag arcsec^-2 found for the average surface 
brightness of galaxies with luminosities M_B = -22 to -14 
mag. This is in agreement with the results of other recent 
studies, such as Bouwens et al. (2004), Barden et al. (2005) 
and Trujillo et al. (2005), but contradicts the null evolution 
findings of Simard et al. (1999) and Ravindranath et al. 
(2004). 
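
The rest-frame absolute magnitudes and physical half-light radii that define the M-R_e plane depend on the adopted cosmological model (matter density 0.3, cosmological constant 0.7, H_0 = 100 km s^-1 Mpc^-1). As a minimal sketch of that conversion, and not the procedure actually used for the catalogue, the Python below integrates the flat-universe distance relation numerically. All function names are hypothetical, flatness is assumed (no curvature term), and the K-correction is treated as an externally supplied number.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance_mpc(z, omega_m=0.3, omega_l=0.7, h0=100.0, steps=1000):
    """Line-of-sight comoving distance in Mpc, assuming a flat universe.

    Integrates dz / E(z) from 0 to z with the composite Simpson rule.
    """
    if z == 0:
        return 0.0

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    n = steps if steps % 2 == 0 else steps + 1  # Simpson needs an even count
    h = z / n
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return (C_KM_S / h0) * total * h / 3.0


def luminosity_distance_mpc(z, **cosmo):
    """Luminosity distance in Mpc (flat universe)."""
    return (1.0 + z) * comoving_distance_mpc(z, **cosmo)


def absolute_magnitude(m_app, z, k_corr=0.0, **cosmo):
    """Absolute magnitude from apparent magnitude, redshift and K-correction."""
    d_l_pc = luminosity_distance_mpc(z, **cosmo) * 1.0e6
    return m_app - 5.0 * math.log10(d_l_pc / 10.0) - k_corr


def kpc_per_arcsec(z, **cosmo):
    """Physical scale in kpc subtended by one arcsecond at redshift z."""
    d_a_mpc = comoving_distance_mpc(z, **cosmo) / (1.0 + z)
    return d_a_mpc * 1000.0 * math.pi / (180.0 * 3600.0)


if __name__ == "__main__":
    z = 0.675
    print(absolute_magnitude(24.0, z), kpc_per_arcsec(z))
```

A measured half-light radius in arcsec multiplied by `kpc_per_arcsec(z)` gives the physical scale size, so together the two functions map a point in the apparent magnitude-size plane to the M-R_e plane for a given redshift.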

An important result to emerge from our artificial 
galaxy simulations was that surface brightness dependent 
measurement errors are a significant source of potential bias 
in the observed LSD. We found that, although exponential 
profile galaxies were detectable with a completeness of 75 
per cent down to ~28 mag arcsec -2 in the UDF i-band 
image, flux and scale sizes could only be recovered to 
within 25 per cent accuracy down to ~27 mag arcsec -2 
(based on SExtractor Kron magnitudes and sizes). As our 
error vector diagram indicates (see Fig. |SJ), the impact 
of these measurement errors is to recover extended, low 
surface brightness galaxies as faint, compact objects. If 
this effect is not accounted for, the observed LSD will be 
doubly biased, with an under-representation of large, 
diffuse systems and an over-abundance of small ones. This 
problem of recoverability is only likely to get worse towards 
higher z and must now be included in all analyses. 
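
The direction of this bias can be illustrated with a small numerical sketch. The Python below uses the standard definition of mean surface brightness within the half-light radius and applies fractional under-estimates to both flux and size. The function names and the input galaxy are hypothetical, and the 25 per cent losses are illustrative inputs borrowed from the accuracy threshold quoted above; the real bias varies with surface brightness and profile shape.

```python
import math


def mean_surface_brightness(mag, r_e_arcsec):
    """Mean surface brightness within the half-light radius (mag arcsec^-2).

    Half of the total flux falls inside r_e, which gives the
    2 * pi * r_e**2 term.
    """
    return mag + 2.5 * math.log10(2.0 * math.pi * r_e_arcsec ** 2)


def recovered_properties(mag_true, r_e_true, flux_lost=0.25, size_lost=0.25):
    """Apply fractional under-estimates of flux and size to the true values."""
    mag_rec = mag_true - 2.5 * math.log10(1.0 - flux_lost)  # fainter
    r_e_rec = r_e_true * (1.0 - size_lost)                  # more compact
    return mag_rec, r_e_rec


if __name__ == "__main__":
    mag_true, r_e_true = 26.0, 1.0  # hypothetical diffuse galaxy
    mag_rec, r_e_rec = recovered_properties(mag_true, r_e_true)
    print("true:     ", mag_true, r_e_true,
          mean_surface_brightness(mag_true, r_e_true))
    print("recovered:", mag_rec, r_e_rec,
          mean_surface_brightness(mag_rec, r_e_rec))
```

With these inputs the recovered object is fainter and smaller than the true one, and its apparent mean surface brightness is higher, so a diffuse system is displaced towards the faint, compact region of the magnitude-size plane.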
6 SUMMARY 

We constructed the iUDF-BRIGHT sample from all 
galaxies detected in the UDF ACS i-band image brighter 
than 28.0 mag. Half light radii were measured for these 
objects and photometric redshifts matched to each from 
the Mobasher catalogue. Individual K-corrections were 
computed using SED template fits to their observed B, V, 
i, z, J, H and K-band fluxes. This allowed the derivation 
of rest-frame, B-band absolute magnitudes and scale sizes 
using an Ω_0 = 0.3, Ω_Λ = 0.7, H_0 = 100 km s^-1 Mpc^-1 
cosmological model. Detailed artificial galaxy simulations 
were then used to establish detection completeness and 
measurement reliability limits in the observational ap- 
parent magnitude-size plane. We mapped these into the 